ABSTRACT



In Search Based Software Engineering, Genetic Program- 
ming has been used for bug fixing, performance improve- 
ment and parallelisation of programs through the modifi- 
cation of source code. Where an evolutionary computation 
algorithm, such as Genetic Programming, is to be applied to 
similar code manipulation tasks, the complexity and size of 
source code for real-world software poses a scalability prob- 
lem. To address this, we intend to inspect how the Software 
Engineering concepts of modularity, granularity and locali- 
sation of change can be reformulated as additional mecha- 
nisms within a Genetic Programming algorithm. 

Categories and Subject Descriptors 

I.2.2 [Artificial Intelligence]: Automatic Programming;
D.1.2 [Software]: Automatic Programming

General Terms 

Algorithms 
1. INTRODUCTION

Automating programming tasks has long been a goal in 
Computer Science. A current incarnation of this effort has 
been termed Search Based Software Engineering (SBSE) [9]. 
SBSE seeks the automation of Software Engineering (SE) 
tasks by posing them as search problems. Various search
algorithms classified under the term Evolutionary
Computation (EC) [7] have been used to manipulate software
source code for a range of purposes [20]. Examples range
from parallelisation [21] and generation of source
code [2] to bug fixing [23].
An issue with the application of EC techniques to the
modification of source code is designing algorithms that can
scale with increased code size and complexity. Increased
size and complexity increase the number of candidates that
an algorithm must search through to find a solution. The
algorithms are further impeded by a "bloat" effect, where
the solutions produced generally expand in size as the search
process progresses, limiting the chances of finding the best
solution.

Modularisation techniques, which allow the algorithm to
define and reuse its own functions during its operation,
have been proposed to scale EC techniques [14]. Another
approach uses static code analysis, before the
algorithm is applied, to discover likely influential locations
within code [24]. The analysis produces a probability
overlay over the source code which focuses the algorithm's
operations on likely useful locations.

Our approach draws on current SE principles which
describe software as complex and constantly changing [26].
Many software architectures are developed to aid software
change by adding different forms of modularity [5] [12] [10]
[16]. Finer grained modularity and encapsulation facilitate
localised modification and replacement of software elements
by introducing more points for modification into the
software [11]. While these software practices and architectures
aid developers in building software for change, the
modification of software source code remains a complex task.

An EC algorithm can be improved in efficiency and so- 
lution finding ability by localising computational modifica- 
tions to where they are most likely needed [24]. Software 
can be modularised at various granularities as specified by a 
developer. This is generally driven by the recommendation 
for low coupling and high cohesion in software. Given a good 
modularisation at the right granularity, software changes can 
be implemented more easily. Making coarse changes would
improve the scalability of the algorithm but may reduce the
algorithm's ability to find the best solution possible. Making
changes at a fine level, such as individual language
constructs rather than whole lines of code, would allow a
wider range of solutions to be generated but would slow down
the algorithm.

Our position is that the SE principles of localisation and 
granularity of change have relevance for further improving 
the scalability of EC specifically for source code
manipulation. Both of these concepts are enabled by
modularisation of code. Instead of using EC algorithms to modularise
software, we see the concept of modularisation as one that
should be incorporated into EC algorithms themselves for
the purposes of improving scalability [15]. Modularity is an
SE concept which advocates that software should be struc- 
tured to be understandable and modifiable [22]. It gives 
developers a general architecture for building software so 
that future modifications of the software can be made as 
efficiently as possible. We wish to incorporate into the
algorithm itself the ability to modularise code for change,
in a form that suits and benefits how the algorithm operates.

We seek to apply modularisation to the problem of source 
code modification and ask: 

• Are the concepts of granularity and localisation bene- 
ficial when designing algorithms for source code mod- 
ification? 

The research goal that follows is to improve the scalability 
of an EC algorithm which modifies source code to change 
non-functional characteristics of software. Our work seeks 
to understand how localising change at varying granularity 
can improve the modification of source code using EC. 

2. RELATED WORK 

An issue with generating software using EC methods is
that the size and complexity of the software increase the
size of the search space, thereby reducing the probability of
successfully finding the best solution [6] [4]. Within the EC
literature this is posed as a scalability problem. When these
techniques are applied to software generation, the problem
limits their applicability to large scale software systems.
Although modification of existing software is an easier
problem for EC techniques to solve than generation, it still
suffers from the problem of scalability.

EC has been used to refactor software for improvement
of software quality by use of a measure for
understandability [18]. The metrics used in this and other work include
software code quality measures for characteristics such as
cohesion and coupling, and this line of work has been termed
Software Module Clustering [19]. The end goal of module
clustering research is to produce software that is better
modularised for developers. Our approach sees modularisation
as a mechanism which can improve the scalability of an
algorithm, regardless of what this form of modularisation may
mean for developers.

Feldt applied algorithmic techniques to N-version
programming [8]. The goal of N-version programming is to improve
fault tolerance by producing multiple differing implementa- 
tions of the same functionality. As the implementations gen- 
erated must be entirely different from each other, there is no 
room for reuse of the original implementation for subsequent 
implementations. The software generation mechanism must 
generate software with the added stipulation that it must 
differ from the original which makes the problem harder. 

The most recent example of an automated approach that
operates directly on existing source code that can be
considered "real-world" is work on bug fixing [23]. Forrest and
Weimer have used GP with multiple evolutionary runs to fix
certain classes of bugs. This approach uses a line of code as
the granularity of change. Static code analysis is used to
localise modification of source code to areas that are likely to
yield a fix. To further improve scalability of the technique,
lines of code that are close in line number to these areas
are more likely to be reused. The GP algorithm is applied
multiple times, indicating that this type of software change
requires a relatively small number of changes to the code.
Multiple runs are performed instead of a single longer run,
indicating that the rate of success of any one run is low. If
a deeper or more widespread modification of the code were
necessary, then multiple runs may not improve the likelihood
of finding a solution. If a relatively small number of changes
is sought to fix a particular bug, this raises the question of
how much of the code must be modified and what types
of modifications are required for different classes of patches
and for changes to other software characteristics. It is not known
how the granularity of source code they choose for
modification affects the class of bug fixes attainable. This may have
a restricting effect on the bugs that can be corrected with
this approach.

Our hypothesis is that this granularity has a positive effect 
on the scalability of the algorithm but restricts the number 
of possible solutions attainable by the algorithm. Further
to this, and anticipating our solution, it is not clear how
dynamically varying granularity would impact the scalability
of the algorithm and the range of modification achievable for
software, such as the degree of performance improvement.
Forrest and Weimer's
results show that the algorithm works well for certain classes
of bugs in code of varying size. While granularity and
localisation have an impact on the scalability of the approach,
the literature leaves open questions around how each
of these affects scalability individually and how they affect
the solutions attainable.

Arcuri has developed a co-evolutionary EC framework to 
generate and improve source code [2]. The technique has 
been applied to the generation of source code for functional 
(bug fixing) and non-functional (performance improvement) 
purposes. Particular attention has been paid to testing,
which works well with co-evolution. In the co-evolutionary
approach, test cases and programs form two separate
populations, with each one's usefulness measure being allocated
by interaction with the other. The number of tests passed
is used as a measure of how useful a generated program is.
The number of generated programs that fail a test is used
as a measure of how useful the test is. This is similar to
Forrest and Weimer's approach in that a series of true or
false test case results is summed to give a measure of
usefulness. In contrast to Forrest and Weimer, White, Arcuri et
al. improve a non-functional characteristic, performance, of
a smaller code-base using a finer granularity without any
localisation [25]. The granularity used is at the language level
in that the program is converted to an Abstract Syntax Tree,
which allows modification of individual language level
constructs such as variables and control structures. This is the
finest granularity allowed by the language. How this
granularity affects the scalability and the range of change
achievable by the algorithm is unclear. We believe that the finer
granularity allows a wider range of solutions to be found by 
the algorithm for smaller code-bases. This work shows the 
applicability of the approach for modifying non-functional 
characteristics but does not inspect how the approach can 
scale to larger systems. Arcuri has also improved software 
performance while maintaining functionality through the use 
of multi-objective optimisation [3]. 
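
The co-evolutionary scoring described above can be sketched
as follows. This is a minimal illustration under our own
assumptions, not Arcuri's implementation: the function
score_populations is hypothetical, programs are modelled as
plain Python callables, and a test is an (input, expected
output) pair.

```python
from typing import Callable, List, Tuple

Program = Callable[[int], int]
Test = Tuple[int, int]  # (input, expected output); an assumed encoding


def score_populations(programs: List[Program], tests: List[Test]):
    """Return (program_scores, test_scores).

    A program earns one point per test it passes.
    A test earns one point per program that fails it.
    """
    program_scores = [0] * len(programs)
    test_scores = [0] * len(tests)
    for p_idx, program in enumerate(programs):
        for t_idx, (arg, expected) in enumerate(tests):
            if program(arg) == expected:
                program_scores[p_idx] += 1
            else:
                test_scores[t_idx] += 1
    return program_scores, test_scores


# Toy populations: candidate implementations of "double the input".
programs = [lambda x: 2 * x, lambda x: x + x, lambda x: x * x]
tests = [(0, 0), (2, 4), (3, 6)]
program_scores, test_scores = score_populations(programs, tests)
```

In this toy case only the third test separates the incorrect
program from the correct ones, so it is the only test that
receives a non-zero usefulness score.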

Comparing Arcuri's work with that of Forrest and Weimer, the
overall changes achievable are restricted to certain classes of
bugs in one and a change in performance in the other. How
granularity and localisation affect the algorithm is not
directly inspected in either approach. Weimer's research
describes the approach's applicability in a number of open
source software code bases of varying size. Arcuri's work
operates on a single model problem. While neither is
conclusive or directly comparable, Weimer and Forrest's work
can be used as a rough indicator of the approach's
generality and scalability when dealing with large amounts of
code. The range of change achievable, such as the
measurable improvement in performance, is not comparable, nor
can even a rough indicator be drawn from the results of
these pieces of work.

From this discussion, and to the best of our knowledge, 
we can conclude that the effects of localisation and granu- 
larity of change have not been inspected for their impact on 
the possible trade-off between scalability and solution find- 
ing ability. We feel that both these concepts can be used 
to improve the operation of EC algorithms when applied to 
source code modification by allowing fine grained changes 
to be made in the right locations. Where coarse grained 
changes are suited to a particular portion of code, an algo- 
rithm should be allowed to operate at this granularity. This 
is expected to provide an efficient use of computational ef- 
fort by focusing change in areas of the search space likely to 
contain good solutions without restricting the solution space 
through prohibiting possible solutions. This is our research 
focus. 

3. SOLUTION 

Our solution would ideally be able to take as input the 
source code for a program, a range of test data and a fitness 
function for testing the performance of the program. The 
solution would return a modified version of the source code 
which has been improved on a scale as defined by the fitness 
function. This modified and improved version would be ar- 
rived at by the application of GP where the source code is 
used as a seed for the initial population. A compiled ver- 
sion of the initial source code along with the test data would 
be used as an oracle for maintaining program functionality. 
The GP system would progress by making changes to it- 
erations of populations of source code programs using the 
fitness function to maintain a bias toward more improved 
programs. 
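
The overall loop can be sketched as below. This is a
deliberately simplified, mutation-only evolutionary loop rather
than a full GP system (no crossover, no tree representation).
The toy instruction set OPS, the functions run and evolve, and
the use of program length as a stand-in for a performance
metric are all assumptions made for illustration. The seed
program populates the initial generation, and its outputs on
the test data act as the oracle.

```python
import random
from typing import Callable, Dict, List

# Hypothetical toy instruction set standing in for source code lines.
OPS: Dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "nop": lambda x: x,
}


def run(program: List[str], x: int) -> int:
    """Execute a program (a list of instruction names) on input x."""
    for op in program:
        x = OPS[op](x)
    return x


def evolve(seed, test_inputs, generations=50, pop_size=20, rng=None):
    rng = rng or random.Random(0)
    # Outputs of the original program are the behaviour to preserve.
    oracle = [run(seed, x) for x in test_inputs]

    def is_correct(program):
        return [run(program, x) for x in test_inputs] == oracle

    def fitness(program):
        # Higher is better: fewer instructions stands in for less run time.
        return -len(program)

    population = [list(seed) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for parent in population:
            child = list(parent)
            if child and rng.random() < 0.5:
                del child[rng.randrange(len(child))]  # delete an instruction
            else:
                position = rng.randint(0, len(child))
                child.insert(position, rng.choice(list(OPS)))
            # Reject children that break the oracle's behaviour.
            offspring.append(child if is_correct(child) else parent)
        # Truncation selection biases the population toward fitter programs.
        ranked = sorted(population + offspring, key=fitness, reverse=True)
        population = ranked[:pop_size]
    return population[0]


seed = ["inc", "nop", "dec", "inc", "nop", "inc"]
test_inputs = [0, 1, 5, -3]
best = evolve(seed, test_inputs)
```

Because selection here is elitist and incorrect children are
discarded, the returned program never performs worse than the
seed on the chosen metric and always agrees with the oracle on
the test data.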

The type of change in software that is sought is the modifi- 
cation of characteristics that could be termed non-functional. 
This is a rough designation of the types of changes sought 
and is an attempt to specify any change that has a relatively 
simple metric associated with it. Simple metrics generally 
relate to non-functional characteristics such as computation 
time required. 

For the use of EC algorithms, a fitness function is required 
that can evaluate programs on a scale fine enough to indi- 
cate improvements after changes to a program. The scale 
should be a multi-valued monotonic metric where increases 
on the scale are linearly related to improvements in software 
characteristics. For functional behaviour, the generation of 
a scale which increases as the desired functionality improves 
is not trivial P3] and requires careful consideration and de- 
sign in itself. For this reason, "non-functional" characteris- 
tics will be used. Our solution should be able to implement 
a different range of change in these characteristics than is 
possible with a static approach. 

For small code segments, improvements can be made using
the finest granularity as shown by Arcuri. As the code size
increases, the chances of finding the same level of
improvement diminish. The finest granularity allows the largest
number of possible solutions but does not scale well. Our
concern is whether and how fine grained evolution can be
scaled. Our hypothesis is that a focused variant of Genetic
Programming via localisation of change may deliver the
appropriate model. If our solution works as planned, we can
automatically rewrite software to improve its performance. 
As the programs are evolved over time, the algorithm would 
be able to refine its focus within the code, making changes 
where needed. 

Our approach to achieving this is to reuse the information 
generated when an offspring individual is produced from 
parent individuals. We believe that this information can 
be used to infer and refine location and granularity indi- 
cators for future modifications as the algorithm progresses. 
These indicators take the form of values associated with ev- 
ery location within each individual program. The values are 
used to bias the selection of locations for modification by 
the GP operators. The values make up a probability mask 
over an individual where any location may be chosen for 
modification but each location has a different probability of 
being subsequently modified. Allowing any change enables 
a larger solution space while avoiding the scalability prob- 
lem this presents by guiding the GP algorithm toward likely 
solutions. These values are updated and passed along as the 
GP algorithm produces more individuals. 
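
A minimal sketch of such a probability mask is given below. It
assumes one positive weight per location (here, per line of
code) and selects the location to modify in proportion to its
weight. The names normalise and pick_location are hypothetical,
and the example weights are invented for illustration.

```python
import random
from typing import List


def normalise(weights: List[float]) -> List[float]:
    """Convert raw per-location values into selection probabilities."""
    total = sum(weights)
    return [w / total for w in weights]


def pick_location(weights: List[float], rng: random.Random) -> int:
    """Pick a location index with probability proportional to its weight.

    Every location with a positive weight can still be chosen, so no
    modification is ruled out; some are only made less likely.
    """
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# One value per line of a four-line program (invented numbers).
weights = [1.0, 1.0, 6.0, 2.0]
probabilities = normalise(weights)

rng = random.Random(42)
counts = [0] * len(weights)
for _ in range(10_000):
    counts[pick_location(weights, rng)] += 1
```

With these weights the third line is selected for modification
most often, while the remaining lines are still occasionally
visited.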

The most basic form of our solution is explained when 
considering a form of single point mutation which modifies 
a program by only one line of code. After this operation we 
have the original program, its fitness value and the modified 
version with its fitness value. Assuming a difference in fit- 
ness is seen between these two programs it can be inferred 
that mutation at or near that location is influential to the 
fitness. If a particular location is found to be influential, the
associated value can be increased, improving this location's
chance of being modified during future mutations. If no
fitness difference is found, the value may be left unchanged or
decreased slightly. If the change causes the program to fail
to compile, the decrease may be larger.
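
The update rule for a single point mutation might be sketched
as follows. The multiplicative factors, the floor value and the
name update_weight are assumptions chosen for illustration; the
text above fixes only the direction of each adjustment, not its
size.

```python
from typing import List


def update_weight(weights: List[float], location: int,
                  parent_fitness: float, child_fitness: float,
                  compiled: bool = True,
                  reward: float = 1.5,          # assumed increase factor
                  small_penalty: float = 0.95,  # assumed slight decrease
                  large_penalty: float = 0.5,   # assumed larger decrease
                  floor: float = 0.01) -> List[float]:
    """Return a copy of weights with the mutated location's value adjusted."""
    updated = list(weights)
    if not compiled:
        factor = large_penalty   # the mutation broke compilation
    elif child_fitness != parent_fitness:
        factor = reward          # the location influences fitness
    else:
        factor = small_penalty   # no observable effect on fitness
    # The floor keeps every location selectable with non-zero probability.
    updated[location] = max(floor, updated[location] * factor)
    return updated


weights = [1.0, 1.0, 1.0]
influential = update_weight(weights, 1, parent_fitness=10.0, child_fitness=12.0)
neutral = update_weight(weights, 1, parent_fitness=10.0, child_fitness=10.0)
broken = update_weight(weights, 1, parent_fitness=10.0, child_fitness=0.0,
                       compiled=False)
```

Note that this sketch rewards any fitness difference, whether
an improvement or a degradation, since either indicates that
the location is influential.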

As the algorithm progresses, a probabilistic mask of where 
change should and shouldn't be performed over the source 
code would emerge. Generating a mask can be achieved by 
other means, such as static code analysis a la Weimer, but 
our argument is that the accuracy of this analysis for guiding 
change reduces as more modifications are made to the code. 
The more modifications are made to a program, the more it 
departs from the original and the less useful the static mask 
becomes. 

We see useful change to software characteristics requiring 
a "deep" modification exemplified by an improved program 
having departed widely from the original. Our hypothesis is 
that the complexity of overall change attainable with mul- 
tiple shorter runs using a static probability overlay is lim- 
ited in comparison to a longer dynamically updating overlay. 
These two approaches would find a different range of solu- 
tions. If our hypothesis is correct, our solution should be 
able to make more complex changes to software. From this 
point of view, our work is addressing the complexity limita- 
tions of GP when applied to software modification. 

Our approach is hoped to be more lightweight than
repeated application of static analysis during a GP run, as it
reuses information generated during the GP run itself. It is
also hoped that our approach may be more expressive, being
able to guide GP more generally than static code analysis,
which would analyse code with regard to specific
characteristics such as CPU time.

If the algorithm is equipped with multiple choices for
granularity of change, e.g. block of code or single line of code,
this solution could be extended to infer the most influential
granularity to be used at a specific location. The algorithm
could make an inference about the granularity of change
that has caused a fitness change. This may be used to infer
that this granularity is a good one for making changes near
this location.
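
One way this extension could look is sketched below, assuming
each location carries one weight per available granularity.
The two granularities, the update factors and the function
names (new_mask, pick_granularity, credit, preferred) are all
hypothetical, and the sketch credits only the exact location
rather than its neighbourhood.

```python
import random
from typing import Dict, List

GRANULARITIES = ("line", "block")  # assumed set of available granularities
Mask = List[Dict[str, float]]


def new_mask(n_locations: int) -> Mask:
    """Start with every granularity equally likely at every location."""
    return [{g: 1.0 for g in GRANULARITIES} for _ in range(n_locations)]


def pick_granularity(mask: Mask, location: int, rng: random.Random) -> str:
    """Choose a granularity at a location in proportion to its weight."""
    options = mask[location]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def credit(mask: Mask, location: int, granularity: str,
           fitness_changed: bool,
           reward: float = 1.5, decay: float = 0.95) -> None:
    """Strengthen a granularity that caused a fitness change, else weaken it."""
    mask[location][granularity] *= reward if fitness_changed else decay


def preferred(mask: Mask, location: int) -> str:
    """Return the currently most influential granularity at a location."""
    return max(mask[location], key=mask[location].get)


mask = new_mask(3)
rng = random.Random(7)
chosen = pick_granularity(mask, 1, rng)
# Suppose a block-level change at location 1 altered fitness.
credit(mask, 1, "block", fitness_changed=True)
```

After the update, block-level changes become the preferred
choice at location 1 while the other locations are unaffected.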

Similar mechanisms have been described in the GP
literature [17] Q] and are expected to be especially useful for use
on software and further extension in this area. Following
on from this, we will develop the solution so it can address 
granularity. Having a modularisation that is based on prob- 
abilities evokes the notion that it is a "fuzzy" modularisation 
which is not rigidly defined like a function and may look dif- 
ferent every time a change is performed. Assuming we can 
experimentally validate how probabilities for change are
allocated by the algorithm within individuals, a further
inspection could be performed into how a probabilistic
modularisation would compare with Automatically Defined Functions
(ADFs). The argument is that ADFs are too restrictive,
eliminating some possible solutions from the search space
through their strict structure. A modularisation that has
its boundaries only probabilistically defined would not re- 
strict the solutions possible while improving scalability and 
guiding GP appropriately. This is less clearly defined as a 
solution and forms our future work. 